Indexing of Sequences of Sets for Efficient Exact and Similar Subsequence Matching

Authors

  • Witold Andrzejewski
  • Tadeusz Morzy
  • Mikolaj Morzy
Abstract

Object-relational database management systems allow users to define complex data types, such as objects, collections, and nested tables. Unfortunately, most commercially available database systems support neither efficient querying nor indexing of complex attributes. Various indexing schemes for complex data types have been proposed in the literature, most of them application-oriented. The lack of a single universal indexing technique for attributes containing sets and sequences of values significantly hinders the practical usability of these data types in user applications. In this paper we present a novel indexing technique for sequence-valued attributes. Our index handles not only sequences of values but also sequences of sets of values. Experimental evaluation demonstrates the feasibility and benefit of the index for exact and similar subsequence matching.

Similar resources

Ranked Subsequence Matching in Time-Series Databases

Existing work on similar sequence matching has focused on either whole matching or range subsequence matching. In this paper, we present novel methods for ranked subsequence matching under time warping, which finds top-k subsequences most similar to a query sequence from data sequences. To the best of our knowledge, this is the first and most sophisticated subsequence matching solution mentione...

On the Sequencing of Tree Structures for XML Indexing

Sequence-based XML indexing aims at avoiding expensive join operations in query processing. It transforms structured XML data into sequences so that a structured query can be answered holistically through subsequence matching. In this paper, we address the problem of query equivalence with respect to this transformation, and we introduce a performance-oriented principle for sequencing tree stru...

Prefix-querying with an L1 distance metric for time-series subsequence matching under time warping

This paper discusses the way of processing time-series subsequence matching under time warping. Time warping enables sequences to be found with similar patterns even when they are of different lengths. The prefix-querying method is the first index-based approach that efficiently performs time-series subsequence matching under time warping without false dismissals. This method employs the L dist...

Fast Retrieval of Similar Subsequences in Long Sequence Databases

Although the Euclidean distance has been the most popular similarity measure in sequence databases, recent techniques prefer to use high-cost distance functions such as the time warping distance and the editing distance for wider applicability. However, if these distance functions are applied to the retrieval of similar subsequences, the number of subsequences to be inspected during the search ...

Exact Mixed Integer Programming for Integrated Scheduling and Process Planning in Flexible Environment

This paper presented a mixed integer programming model for integrated scheduling and process planning. The presented process plan included some orders with precedence relations similar to the Multiple Traveling Salesman Problem (MTSP), which is categorized as an NP-hard problem. These types of problems are also called advanced planning because of simultaneously determining the appropriate sequence and m...

Publication date: 2005